An Efficient Reliable Broadcast Protocol
Many distributed and parallel applications can make good use of broadcast communication. In this paper we present a (software) protocol that simulates reliable broadcast, even on an unreliable network. Using this protocol, application programs need not worry about lost messages. Recovery from communication failures is handled automatically and transparently by the protocol. In normal operation, our protocol is more efficient than previously published reliable broadcast protocols. An initial implementation of the protocol on 10 MC68020 CPUs connected by a 10 Mbit/sec Ethernet performs a reliable broadcast in 1.5 msec.
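The abstract describes the protocol only at a high level; the following is a minimal sketch, assuming a simple scheme in which the sender numbers its broadcasts and receivers detect gaps and request retransmission. The class and method names are hypothetical and do not reproduce the authors' protocol.

    import java.util.*;

    /** Hypothetical sketch: a receiver tracks broadcast sequence numbers,
     *  records retransmission requests for any gaps it detects, and
     *  delivers messages to the application strictly in order.          */
    final class BroadcastReceiver {
        private long nextExpected = 0;                                    // next deliverable sequence number
        private final SortedMap<Long, String> pending = new TreeMap<>();  // buffered out-of-order messages

        /** Called for every broadcast that actually arrives; missing sequence
         *  numbers are appended to retransmitRequests for the sender.        */
        List<String> onMessage(long seq, String payload, List<Long> retransmitRequests) {
            List<String> delivered = new ArrayList<>();
            if (seq < nextExpected) return delivered;                 // duplicate, already delivered
            pending.put(seq, payload);
            for (long s = nextExpected; s < seq; s++) {               // any gap below seq is missing
                if (!pending.containsKey(s)) retransmitRequests.add(s);
            }
            while (pending.containsKey(nextExpected)) {               // deliver the in-order prefix
                delivered.add(pending.remove(nextExpected));
                nextExpected++;
            }
            return delivered;
        }
    }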
SiL: An Approach for Adjusting Applications to Heterogeneous Systems Under Perturbations
Scientific applications consist of large and computationally intensive loops. Dynamic loop scheduling (DLS) techniques are used to load balance the execution of such applications. Load imbalance can be caused by variations in loop iteration execution times due to problem, algorithmic, or systemic characteristics (collectively referred to as perturbations). The following question motivates this work: "Given an application, a high-performance computing (HPC) system, and both their characteristics and interplay, which DLS technique will achieve improved performance under unpredictable perturbations?" Existing work only considers perturbations caused by variations in the computational speed delivered by the HPC system. However, perturbations in available network bandwidth or latency are inevitable on production HPC systems. Simulator in the loop (SiL) is introduced herein as a new control-theory-inspired approach to dynamically select DLS techniques that improve the performance of applications on heterogeneous HPC systems under perturbations. The present work examines the performance of six applications on a heterogeneous system under all of the above system perturbations. The SiL proof of concept is evaluated using simulation. The performance results confirm the initial hypothesis that no single DLS technique can deliver the best performance in all scenarios, while the SiL-based DLS selection delivered improved application performance in most experiments.
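As a rough illustration of the simulator-in-the-loop idea, a scheduler can periodically feed the measured system state to a simulator, predict the remaining execution time under each candidate DLS technique, and switch to the predicted best one. The sketch below is hypothetical; the enum values and method names are illustrative and not the SiL implementation.

    import java.util.function.ToDoubleFunction;

    /** Hypothetical selection step of a simulator-in-the-loop scheduler:
     *  at a decision point, pick the DLS technique whose simulated
     *  remaining execution time is lowest.                              */
    final class SilSelector {
        enum Dls { STATIC, SELF_SCHEDULING, GUIDED, FACTORING, WEIGHTED_FACTORING }

        /** predictRemainingTime stands in for a simulator invocation that is
         *  fed the currently measured core speeds, bandwidth, and latency.  */
        static Dls select(ToDoubleFunction<Dls> predictRemainingTime) {
            Dls best = Dls.STATIC;
            double bestTime = Double.POSITIVE_INFINITY;
            for (Dls candidate : Dls.values()) {
                double t = predictRemainingTime.applyAsDouble(candidate);
                if (t < bestTime) { bestTime = t; best = candidate; }
            }
            return best;
        }
    }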
High-performance parallel programming in Java: exploiting native libraries
With most of today's fast scientific software written in Fortran and C, Java has a lot of catching up to do. In this paper we discuss how new Java programs can capitalize on high-performance libraries for other languages. With the help of a tool, we have automatically created Java bindings for several standard libraries: MPI, BLAS, BLACS, PBLAS, and ScaLAPACK. The purpose of the additional software layer introduced by the bindings is to resolve the interface problems between different programming languages, such as data type mapping, pointers, and multidimensional arrays. For evaluation, performance results are presented for Java versions of two benchmarks from the NPB and PARKBENCH suites on the IBM SP2 using the JDK and IBM's high-performance Java compiler, and on the Fujitsu AP3000 using Toba, a Java-to-C translator. The results confirm that fast parallel computing in Java is indeed possible.
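The abstract does not show what the generated binding layer looks like. A minimal hand-written sketch, assuming the common JNI route and a BLAS dot-product routine, might resemble the following; the class name and library name are hypothetical, and the small C forwarding stub is not shown.

    /** Hypothetical analogue of a generated BLAS binding: the Java side
     *  declares a native method, and a JNI stub forwards it to ddot.    */
    public final class NativeBlas {
        static {
            System.loadLibrary("blasjni");                    // hypothetical JNI wrapper library
        }

        /** Dot product of two double vectors, delegated to native BLAS ddot. */
        public static native double ddot(int n, double[] x, int incx,
                                                double[] y, int incy);

        public static void main(String[] args) {
            double[] x = {1.0, 2.0, 3.0};
            double[] y = {4.0, 5.0, 6.0};
            System.out.println(ddot(x.length, x, 1, y, 1));   // expected: 32.0
        }
    }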
Massively parallel computing in Java
Although Java was not specifically designed for the computationally intensive numeric applications that are the typical fodder of highly parallel machines, its widespread popularity and portability make it an interesting candidate vehicle for massively parallel programming. With the advent of high-performance optimizing Java compilers, the open question is: how can Java programs best exploit massive parallelism? The authors have been contemplating this question via libraries of Java routines for specifying and coordinating parallel codes. It would be most desirable to have these routines written in 100%-Pure Java; however, a more expedient solution is to provide Java wrappers (stubs) to existing parallel coordination libraries such as MPI. MPI is an attractive alternative because, like Java, it is portable. We discuss both approaches here. In undertaking this study, we have also identified some minor modifications to the current language specification that would make 100%-Pure Java parallel programming more natural.
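To make the wrapper (stub) approach concrete, a hypothetical thin Java stub over an MPI-style coordination library could expose the usual SPMD entry points, as in the sketch below; the class and method names are illustrative and do not correspond to any particular binding such as mpiJava.

    /** Hypothetical Java stubs over a native MPI-style library; each native
     *  method would be a thin JNI forwarder to the corresponding C MPI call. */
    final class Mpi {
        static { System.loadLibrary("mpistub"); }            // hypothetical wrapper library
        static native void init(String[] args);
        static native int rank();                            // this process's rank
        static native int size();                            // total number of processes
        static native void send(double[] buf, int dest, int tag);
        static native void recv(double[] buf, int source, int tag);
        static native void finish();                         // wraps MPI_Finalize
    }

    /** Sketch of SPMD usage: rank 0 sends a vector to every other rank. */
    final class SpmdExample {
        public static void main(String[] args) {
            Mpi.init(args);
            double[] data = new double[4];
            if (Mpi.rank() == 0) {
                for (int dest = 1; dest < Mpi.size(); dest++) Mpi.send(data, dest, 0);
            } else {
                Mpi.recv(data, 0, 0);
            }
            Mpi.finish();
        }
    }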
Toward a Standard Interface for User-Defined Scheduling in OpenMP
Parallel loops are an important part of OpenMP programs. Efficient scheduling of parallel loops can improve the performance of these programs. The current OpenMP specification only offers three options for loop scheduling, which are insufficient in certain instances. Given the large number of other possible scheduling strategies, standardizing each of them is infeasible. A more viable approach is to extend the OpenMP standard to allow a user to define loop scheduling strategies within her application. This approach will enable standard-compliant, application-specific scheduling. This work analyzes the principal components required by user-defined scheduling and proposes two competing interfaces as candidates for the OpenMP standard. We conceptually compare the two proposed interfaces with respect to the three host languages of OpenMP, i.e., C, C++, and Fortran. These interfaces serve the OpenMP community as a basis for discussion and for prototype implementations supporting user-defined scheduling in an OpenMP library.
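For illustration only, the kind of components a user-defined schedule must supply (per-loop initialization, a thread-safe "next chunk" step, and cleanup) can be sketched as an interface. The sketch below is written in Java purely as a language-neutral illustration and is hypothetical; the paper's actual proposals target OpenMP's host languages C, C++, and Fortran.

    import java.util.concurrent.atomic.AtomicLong;

    /** Hypothetical illustration of the principal components of a user-defined
     *  loop schedule: initialization, chunk hand-out, and cleanup.             */
    interface UserDefinedSchedule {
        void init(long totalIterations, int numThreads);   // called once before the loop
        long[] nextChunk(int threadId);                     // returns {start, size}, or null when done
        void finish();                                      // called once after the loop
    }

    /** Example strategy: fixed-size chunks handed out on demand (self-scheduling). */
    final class FixedChunkSchedule implements UserDefinedSchedule {
        private final long chunkSize;
        private final AtomicLong next = new AtomicLong();
        private long total;

        FixedChunkSchedule(long chunkSize) { this.chunkSize = chunkSize; }
        public void init(long totalIterations, int numThreads) { total = totalIterations; next.set(0); }
        public long[] nextChunk(int threadId) {
            long start = next.getAndAdd(chunkSize);
            if (start >= total) return null;
            return new long[]{start, Math.min(chunkSize, total - start)};
        }
        public void finish() { }
    }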
OpenMP Loop Scheduling Revisited: Making a Case for More Schedules
In light of continued advances in loop scheduling, this work revisits OpenMP loop scheduling by outlining the current state of the art and presenting evidence that the existing OpenMP schedules are insufficient for all combinations of applications, systems, and their characteristics. A review of the state of the art shows that, due to the specifics of parallel applications, the variety of computing platforms, and the numerous performance degradation factors, no single loop scheduling technique can be a 'one-fits-all' solution that effectively optimizes the performance of all parallel applications in all situations. Irregularity in computational workloads and hardware systems, including operating system noise, degrades the performance of parallel applications; this impact has often been neglected in loop scheduling research, in particular in the context of OpenMP schedules. Existing dynamic loop self-scheduling techniques, such as trapezoid self-scheduling, factoring, and weighted factoring, offer unexplored potential to alleviate this degradation in OpenMP because they explicitly target the minimization of load imbalance and scheduling overhead. Through theoretical and experimental evaluation, this work shows that these loop self-scheduling methods provide a benefit in the context of OpenMP. In conclusion, OpenMP must include more schedules to offer a broader performance coverage of applications executing on an increasing variety of heterogeneous shared memory computing platforms.
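The dynamic self-scheduling techniques named above differ mainly in how they shrink chunk sizes as a loop progresses. The sketch below shows simplified, commonly cited textbook formulations (factoring with a fixed factor of 2, and trapezoid self-scheduling with a final chunk size of 1); it is an illustration under those assumptions, not the exact variants evaluated in this work.

    import java.util.ArrayList;
    import java.util.List;

    /** Simplified textbook chunk-size rules for two dynamic self-scheduling
     *  techniques. n = total loop iterations, p = number of threads.        */
    final class SelfScheduling {

        /** Factoring (simplified, factor 2): schedule batches of p equal chunks,
         *  each batch consuming roughly half of the remaining iterations.       */
        static List<Long> factoringChunks(long n, int p) {
            List<Long> chunks = new ArrayList<>();
            long remaining = n;
            while (remaining > 0) {
                long chunk = Math.max(1, (long) Math.ceil(remaining / (2.0 * p)));
                for (int i = 0; i < p && remaining > 0; i++) {
                    long c = Math.min(chunk, remaining);
                    chunks.add(c);
                    remaining -= c;
                }
            }
            return chunks;
        }

        /** Trapezoid self-scheduling: chunk sizes decrease linearly from
         *  f = ceil(n / (2p)) down to l = 1.                              */
        static List<Long> trapezoidChunks(long n, int p) {
            List<Long> chunks = new ArrayList<>();
            long f = (long) Math.ceil(n / (2.0 * p)), l = 1;
            long c = (long) Math.ceil(2.0 * n / (f + l));       // number of chunks
            double d = c > 1 ? (double) (f - l) / (c - 1) : 0;  // linear decrement
            long remaining = n;
            for (long i = 0; i < c && remaining > 0; i++) {
                long chunk = Math.min(remaining, Math.max(1, Math.round(f - i * d)));
                chunks.add(chunk);
                remaining -= chunk;
            }
            if (remaining > 0) chunks.add(remaining);           // rounding leftover
            return chunks;
        }
    }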